Duplicate detection of 2D-NMR Spectra
Abstract
2D-Nuclear magnetic resonance (NMR) spectra are used in the (structural) analysis of small molecules. In contrast to 1D-NMR spectra, 2D-NMR spectra correlate the chemical shifts of 1H and 13C at the same time. A spectrum consists of several peaks in a two-dimensional space. The most important information about a peak is the location of its center, which captures the bonding relationships of hydrogen and carbon atoms. A spectrum contains much information about the chemical structure of a product, but in most cases the structure cannot be read off in a simple and straightforward manner. Structure elucidation involves a considerable amount of (manual) effort. Using high-field NMR spectrometers, many 2D-NMR spectra can be recorded in a short time. A common situation is therefore that a lab or company has a repository of 2D-NMR spectra that is only partially annotated with structural information. For the remaining spectra the structure is unknown. When two research labs collaborate, their repositories are merged and the annotations shared. We reduce that problem to the task of finding duplicates in a given set of 2D-NMR spectra. To this end, we propose a simple but robust definition of 2D-NMR duplicates, which allows for small measurement errors. We give a quadratic algorithm for the problem, which can be implemented in SQL. Further, we analyze a more abstract class of heuristics, which are based on selecting particular peaks. Such a heuristic works as a filter step on the pairs of possible duplicates and allows false positives. We compare all methods with respect to their run time. Finally, we discuss the effectiveness of the duplicate definition on real data.
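The abstract does not spell out the duplicate definition, so the following is only a minimal sketch under stated assumptions: a spectrum is taken to be a list of peak centers `(h_shift, c_shift)`, and two spectra are treated as duplicates when every peak of one has a partner in the other within a per-axis tolerance. The tolerance values, the function names, and the choice of filter (comparing only the smallest and largest 1H shifts) are all hypothetical illustrations, not the paper's method. The filter is a cheap necessary condition, so it may let false positives through, which the full pairwise check then rejects.

```python
from itertools import combinations

# Hypothetical tolerances in ppm for the 1H and 13C axes (not from the paper).
TOL_H = 0.05
TOL_C = 0.5


def peaks_match(p, q, tol_h=TOL_H, tol_c=TOL_C):
    """True if two peak centers (h_shift, c_shift) agree within the tolerances."""
    return abs(p[0] - q[0]) <= tol_h and abs(p[1] - q[1]) <= tol_c


def covered(a, b, tol_h=TOL_H, tol_c=TOL_C):
    """True if every peak of spectrum a has a matching peak in spectrum b."""
    return all(any(peaks_match(p, q, tol_h, tol_c) for q in b) for p in a)


def is_duplicate(a, b, tol_h=TOL_H, tol_c=TOL_C):
    """Assumed duplicate definition: the two spectra cover each other."""
    return covered(a, b, tol_h, tol_c) and covered(b, a, tol_h, tol_c)


def passes_filter(a, b, tol_h=TOL_H):
    """Cheap pre-check on the extreme 1H shifts only.

    If two spectra are duplicates under the assumed definition, their smallest
    (and largest) 1H shifts differ by at most tol_h, so rejecting a pair here
    never loses a true duplicate. Pairs that pass may still be false positives.
    """
    if not a or not b:
        return not a and not b
    h_a = [p[0] for p in a]
    h_b = [p[0] for p in b]
    return (abs(min(h_a) - min(h_b)) <= tol_h
            and abs(max(h_a) - max(h_b)) <= tol_h)


def find_duplicates(spectra):
    """Check all pairs of spectra (quadratic in the number of spectra)."""
    return [
        (name_a, name_b)
        for name_a, name_b in combinations(spectra, 2)
        if passes_filter(spectra[name_a], spectra[name_b])
        and is_duplicate(spectra[name_a], spectra[name_b])
    ]


# Made-up example data, for illustration only.
spectra = {
    "s1": [(1.20, 20.0), (3.50, 55.0)],
    "s2": [(1.22, 20.3), (3.48, 54.8)],
    "s3": [(1.20, 20.0), (7.10, 128.0)],
}
duplicates = find_duplicates(spectra)
```

With the made-up data above, only `s1` and `s2` are reported as a duplicate pair; `s3` is rejected by the filter before the peak-by-peak comparison runs.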
Similar resources
Fast Approximate Duplicate Detection for 2D-NMR Spectra
2D-Nuclear magnetic resonance (NMR) spectroscopy is a powerful analytical method to elucidate the chemical structure of molecules. In contrast to 1D-NMR spectra, 2D-NMR spectra correlate the chemical shifts of 1H and 13C simultaneously. To curate or merge large spectra libraries, a robust (and fast) duplicate detection is needed. We propose a definition of duplicates with the desired robustness pro...
NMR and vibrational spectra of 2-methoxycarbonyl-7-methyl-1,3-thiazino[3,2-b][1,2,4]triazine-4,8-dione: a joint of experimental and DFT
The IR and NMR spectra were coupled with quantum chemical calculations in the DFT approach, using the hybrid B3LYP exchange-correlation functional, to confirm the structure of 2-methoxycarbonyl-7-methyl-1,3-thiazino[3,2-b][1,2,4]triazine-4,8-dione 2d.
Rapid acquisition of wideline MAS solid-state NMR spectra with fast MAS, proton detection, and dipolar HMQC pulse sequences.
The solid-state NMR spectra of many NMR active elements are often extremely broad due to the presence of chemical shift anisotropy (CSA) and/or the quadrupolar interaction (for nuclei with spin I > 1/2). These NMR interactions often give rise to wideline solid-state NMR spectra which can span hundreds of kHz or several MHz. Here we demonstrate that by using fast MAS, proton detection and dipola...
Similarity among tandem mass spectra from proteomic experiments: detection, significance, and utility.
Liquid chromatography paired with tandem mass spectrometry is a standard technique for identifying peptides from complex protein mixtures. Most fragment ion spectra acquired by this technique are unique, but some are repeated. Similarities among the spectra from 1D and 2D liquid chromatography experiments were calculated by the dot product algorithm. Similar spectra were grouped, and the degree...
Detection and Characterization of Human Teeth Caries Using 2D Correlation Raman Spectroscopy
Background: Carious lesions are formed by a complex process of chemical interaction between dental enamel and its environment. They can cause cavities and pain, and are expensive to fix. It is hard to characterize in vivo as a result of environment factors and remineralization by ions in the oral cavity. Objectives: The development of a technique that gives early diagnosis which is non-invasi...
Journal title:
- J. Integrative Bioinformatics
Volume 4, Issue
Pages -
Publication date 2007